ABSTRACT


The  convergence  characteristics  of  an  iterative  method  for  solving  area  search  games 
were  investigated.  This  method,  Fictitious  Play,  was  first  introduced  by  G.  W.  Brown 
and  solves  two-person  zero-sum  games  by  having  each  player  sequentially  select  a  pure 
strategy  based  on  the  combined  past  actions  of  his  opponent.  The  Fictitious  Play 
method  was  successfully  implemented  for  an  area  search  game  in  which  two  players,  a 
searcher  and  a  target,  move  independently  through  an  area.  In  this  game,  the  payoff  is 
the  number  of  detections  of  the  target  by  the  searcher.  For  each  iteration  of  the  game, 
an  upper  and  lower  bound  on  the  value  of  the  game  were  determined  and  as  the  number 
of  iterations  of  the  game  increased,  these  bounds  converged  to  the  actual  solution.  In 
the games examined, the convergence of the bounds was closely approximated by a
power function ($y = \alpha n^{\beta}$), with large games converging more slowly. Because of the
observed symmetrical convergence of the bounds, an accurate approximation of the value
of  the  game  was  obtainable  from  the  average  of  the  upper  and  lower  bounds. 
I.  INTRODUCTION


A.  BACKGROUND 

The convergence properties of a computational method for solving finite matrix
games were investigated. This method, Fictitious Play, was introduced by George W.
Brown  [Ref.  1]  and  is  an  iterative  method  based  on  the  imagined  play  of  the  two  game 
participants.  At  each  fictitious  play  iteration,  Brown's  technique  computes  upper  and 
lower  bounds  on  the  value  of  the  game  and  approximates  the  optimal  strategy  for  each 
player.  For  the  games  examined  here,  the  convergence  characteristics  of  the  upper  and 
lower bounds allow for an accurate approximation of the value of the game after relatively
few iterations. Although the convergence rate of the bounds to the value of the
game is slow, this iterative process allows the solving of large matrix problems which
tend to become cumbersome with other common methods, e.g., Linear Programming.
Very little is known about the convergence properties of Fictitious Play, and it is the
purpose  of  this  study  to  experimentally  examine  the  rate  of  convergence  for  a  specific 
two-person  zero-sum  area  search  game. 

B.  PROBLEM  STATEMENT 

This  study  was  motivated  by  an  area  search  game  in  which  a  searcher  looks  for  an 
evading  target,  each  moving  among  a  finite  number  of  cells  in  discrete  time  periods.  The 
searcher  and  target  each  independently  select  a  path  through  the  search  area  (a  pure 
strategy)  or  some  probabilistic  combination  of  paths  (a  mixed  strategy).  These  paths 
are feasible combinations of cells serially connected over a time period $T$. That is, if the
current cell is $i$, the next cell must be selected from a set $C_i$ of neighboring cells. Although
it will be assumed here, it is not necessary that cell $i$ have the same set of
neighbors for both searcher and target. For each game play, the searcher and target
select feasible $T$-time period paths. The payoff is the expected number of times the
searcher and target are in the same cell in the same time period. The searcher attempts
to maximize and the target to minimize this payoff.

Because  the  number  of  paths  can  be  quite  large,  Fictitious  Play  was  selected  to  solve 
this  game.  It  soon  became  evident  that  the  rate  of  convergence  of  Fictitious  Play  would 
determine  whether  or  not  it  was  a  useful  solution  method. 
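
For a very small search area the payoff matrix of this game can be written out explicitly. The following Python sketch is illustrative only and is not part of the original study. The function names, the 0-based cell numbering, the neighbor lists and the fixed starting cells are all assumptions made for the example.

```python
def feasible_paths(neighbors, start, T):
    """Enumerate every feasible T-period path that begins in cell `start`.

    neighbors[i] lists the cells reachable from cell i in one time period
    (an assumed encoding of the sets C_i).
    """
    paths = [[start]]
    for _ in range(T - 1):
        paths = [p + [j] for p in paths for j in neighbors[p[-1]]]
    return paths


def detections(searcher_path, target_path):
    """Number of time periods in which both players occupy the same cell."""
    return sum(s == t for s, t in zip(searcher_path, target_path))


def payoff_matrix(searcher_paths, target_paths):
    """Rows are searcher paths, columns are target paths."""
    return [[detections(a, b) for b in target_paths] for a in searcher_paths]


if __name__ == "__main__":
    # Hypothetical 3-cell strip: a player may stay put or move to an adjacent cell.
    neighbors = [[0, 1], [0, 1, 2], [1, 2]]
    searcher_paths = feasible_paths(neighbors, start=0, T=3)
    target_paths = feasible_paths(neighbors, start=2, T=3)
    for row in payoff_matrix(searcher_paths, target_paths):
        print(row)
```

The number of paths grows geometrically with the number of time periods, which is the reason total enumeration is practical only for very small games.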
C.  PREVIOUS  WORK

Fictitious Play was first introduced by G. W. Brown [Ref. 1] as an iterative process
for solving finite two-person zero-sum games. Brown hypothesized that the rate of
convergence to the value of the game was proportional to $1/n$, where $n$ is the number
of fictitious play iterations. Julia Robinson [Ref. 2] proved the process converged, thus
formally demonstrating its potential validity as a solution method. J. M. Danskin
[Ref. 3] showed that Fictitious Play applies to continuous two-person zero-sum games
as well. S. Karlin [Ref. 4] hypothesized that the rate of convergence was $1/\sqrt{n}$, but
further asserted that in practice it could be expected to converge more rapidly. No
further  work  or  relevant  information  on  the  convergence  properties  of  Brown's  Fictitious 
Play  had  been  discovered  up  to  the  time  of  this  study. 


II.  METHODOLOGY 


A.  FINITE  MATRIX  GAME 

The  area  search  game  presented  here  can  be  represented  as  a  finite  matrix  game. 
The  elements  of  the  payoff  matrix  are  the  number  of  detections  of  the  target  by  the 
searcher,  i.e.,  the  number  of  time  periods  when  the  searcher  and  target  are  in  the  same 
cell. Figure 1 on page 4 depicts this matrix game, where

$\alpha_i$ = pure strategy $i$ of searcher (a feasible $T$-time period path),
$\beta_j$ = pure strategy $j$ of target (a feasible $T$-time period path), and
$r_{ij}$ = number of detections

for $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, m$.


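[Figure 1 layout: payoff matrix with the searcher's pure strategies $\alpha_1, \ldots, \alpha_n$ as rows, the target's pure strategies $\beta_1, \ldots, \beta_m$ as columns, and entries $r_{ij}$.]

Figure 1. Finite Matrix Area Search Game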


In this finite matrix game, the searcher can calculate a security level $V_S$, as

$$V_S = \max_i \left[ \min_j r_{ij} \right].$$

Similarly the target can calculate a security level $V_T$, as

$$V_T = \min_j \left[ \max_i r_{ij} \right].$$

In all cases, $V_S \le V_T$, and if $V_S = V_T = V$, the game has an equilibrium or saddle point.
Associated with this saddle point is the value of the game $V$, and optimal pure strategies
$\alpha^*$ and $\beta^*$.
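
The two pure-strategy security levels are simple to compute once the payoff matrix is available. The short Python sketch below is an illustration under assumed names (`pure_security_levels`, the example matrices) and is not taken from the original program. It evaluates $V_S$ and $V_T$ and reports whether a saddle point exists.

```python
import numpy as np


def pure_security_levels(r):
    """Return (V_S, V_T) for payoff matrix r (rows: searcher, columns: target)."""
    r = np.asarray(r, dtype=float)
    v_s = r.min(axis=1).max()  # searcher: largest of the row minima
    v_t = r.max(axis=0).min()  # target: smallest of the column maxima
    return v_s, v_t


if __name__ == "__main__":
    # Two hypothetical matrices: the first has a saddle point, the second does not.
    for r in ([[2, 3], [1, 4]], [[1, 0], [0, 1]]):
        v_s, v_t = pure_security_levels(r)
        print(r, "V_S =", v_s, "V_T =", v_t, "saddle point:", v_s == v_t)
```

When the two levels differ, as in the second matrix, the value of the game lies between them and can only be attained with mixed strategies.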
In games where equilibrium points do not exist, mixed strategies can be used to
specify a value of the game. A mixed strategy is a set of pure strategies $\alpha_i$ that are
weighted with probabilities $x_i$, where

$$\sum_i x_i = 1 \quad \text{and} \quad x_i \ge 0.$$

For the searcher, a mixed strategy is denoted as $X = (x_1\alpha_1, x_2\alpha_2, \ldots, x_n\alpha_n)$, and for the
target, it is denoted as $Y = (y_1\beta_1, y_2\beta_2, \ldots, y_m\beta_m)$. With mixed strategies, the security levels
for the searcher and target are calculated respectively by:

$$V_S = \max_x \min_y \sum_{i,j} x_i y_j r_{ij}$$

and

$$V_T = \min_y \max_x \sum_{i,j} x_i y_j r_{ij}.$$

John von Neumann [Ref. 5] showed that

$$V_S = V_T = V,$$

where $V$ is the value of the game. The searcher mixed strategy which achieves $V$ is
$X^*$, the searcher's optimal mixed strategy; and the target mixed strategy which achieves
$V$ is $Y^*$, the target's optimal mixed strategy.

B.  FICTITIOUS  PLAY 

Fictitious Play is an iterative method for approximating $V$, $X^*$ and $Y^*$ for a
two-person zero-sum game. This method was first introduced by George W. Brown
[Ref. 1] and is conceptually described as follows:

The iterative method in question can be loosely characterized by the fact that it rests
on the traditional statistician's philosophy of basing future decisions on the relevant
past history. Visualize two statisticians, perhaps ignorant of min-max theory, playing
many plays of the same discrete zero-sum game. One might naturally expect a
statistician to keep track of the opponent's past play and, in the absence of a more
sophisticated calculation, perhaps to choose at each play the optimum pure strategy
against the mixture represented by all the opponent's past plays. For calculation
purposes the rule used here is that strategies will be named in turn for each side,
choosing at each turn a pure strategy which is optimal against the cumulative history
of the opponent's play to date. [Ref. 1: p. 374]

The method is a relatively simple iterative process that directs a player to select an optimal
pure strategy in response to the current empirical mixed strategy of his opponent.
Applied  to  the  area  search  game,  an  iteration  of  this  procedure  consists  of  the  following 
steps. 

1.  Based  on  an  equal  weighting  of  all  the  searcher's  pure  strategies  observed  so  far 
by  the  target,  the  target  selects  the  best  pure  strategy  response. 

2.  A  lower  bound  on  the  value  of  the  game  is  computed  as  the  expected  number  of 
detections  when  the  searcher  plays  his  current  mixed  strategy  and  the  target  selects 
the  best  pure  strategy  response. 

3.  Based  on  an  equal  weighting  of  all  the  target's  pure  strategies  observed  so  far  by 
the  searcher,  the  searcher  selects  the  best  pure  strategy  response. 

4.  An  upper  bound  on  the  value  of  the  game  is  computed  as  the  expected  number  of 
detections  when  the  target  plays  his  current  mixed  strategy  and  the  searcher  selects 
the  best  pure  strategy  response.  (Go  to  step  1  for  next  iteration) 

The procedure begins with the target assuming an arbitrary searcher strategy. As the
number of iterations of the game is increased, the upper and lower bounds on the value
of the game converge toward the actual value, and the limit of any convergent subsequence
of the empirical mixed strategies is an optimal mixed strategy. The convergence rate has been
observed to be quite slow, so an effective solution might require a large number of Fictitious
Play iterations. The process is considered complete when the difference between
the bounds on the value of the game is sufficiently small. At this point an approximate
value of the game and approximate optimal strategies for both players are obtained.
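
The four steps above can be sketched for a game whose payoff matrix is small enough to store. The Python code below is a minimal illustration, not the program used in this study. It assumes an explicit matrix, counts the arbitrary initial searcher strategy as one observed play, breaks ties by taking the first best response, and retains the tightest bound seen so far; the function and variable names are hypothetical.

```python
import numpy as np


def fictitious_play(r, iterations, first_row=0):
    """Fictitious Play on payoff matrix r (rows: searcher maximizes,
    columns: target minimizes). Returns (lower, upper, x, y)."""
    r = np.asarray(r, dtype=float)
    n_rows, n_cols = r.shape
    row_counts = np.zeros(n_rows)
    col_counts = np.zeros(n_cols)
    row_counts[first_row] = 1.0  # arbitrary initial searcher strategy (assumption)
    lower, upper = -np.inf, np.inf

    for _ in range(iterations):
        # Step 1: target's best pure response to the searcher's empirical mix.
        x = row_counts / row_counts.sum()
        col_payoffs = x @ r
        j = int(np.argmin(col_payoffs))
        # Step 2: lower bound on the value of the game (tightest retained).
        lower = max(lower, col_payoffs[j])
        col_counts[j] += 1.0

        # Step 3: searcher's best pure response to the target's empirical mix.
        y = col_counts / col_counts.sum()
        row_payoffs = r @ y
        i = int(np.argmax(row_payoffs))
        # Step 4: upper bound on the value of the game (tightest retained).
        upper = min(upper, row_payoffs[i])
        row_counts[i] += 1.0

    x = row_counts / row_counts.sum()
    y = col_counts / col_counts.sum()
    return lower, upper, x, y


if __name__ == "__main__":
    # Hypothetical cyclic game whose value is 1 by symmetry.
    lower, upper, x, y = fictitious_play([[0, 1, 2], [2, 0, 1], [1, 2, 0]], 5000)
    print("bounds:", lower, upper, "midpoint:", 0.5 * (lower + upper))
```

For the area search game the two matrix products would be replaced by the Dynamic Programming best responses, since the payoff matrix is too large to enumerate.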

The empirical mixed strategies after the $k$th iteration of Fictitious Play, $A^k$ and $B^k$,
are calculated from the relative frequencies of all the previously selected pure strategies
of the searcher and target respectively. That is, consider the game that has been replicated
$k$ times and the searcher has selected pure strategies $(a^1, a^2, \ldots, a^k)$, where $a^j$ is the
pure strategy chosen in the $j$th replication. If $r_i$ denotes the number of times pure
strategy $\alpha_i$ is used, then the pure strategy $\alpha_i$ is weighted with the relative frequency $r_i/k$.
This results in the empirical mixed strategy

$$A^k = \left( \frac{r_1}{k}\alpha_1, \frac{r_2}{k}\alpha_2, \ldots, \frac{r_n}{k}\alpha_n \right)$$

for the searcher and similarly $B^k$, formed from the relative frequencies of the target's
pure strategies $\beta_j$, for the target.

1.  Convergence  Rate 

The convergence rate of the upper and lower bounds of the value of the game
was first hypothesized by G. W. Brown [Ref. 1] to be $1/n$, where $n$ is the number of iterations.
He supported this hypothesis by relating the iterative method, as a difference
equation, to a set of differential equations for which a convergence rate could be shown.
The most recently found discussion in the literature on the convergence rate was presented
by Samuel Karlin [Ref. 4]. He stated:

It is conjectured that the process converges at a rate $1/\sqrt{k}$, where $k$ is the number
of iterations. In actual cases, it is found that the process is far more efficient than
is expected theoretically. [Ref. 4: p. 183]

No reference was made to how or where this conjectured rate of convergence was determined.
It appears that the rate of convergence of Fictitious Play is unclear and requires
further investigation.

C.  DYNAMIC  PROGRAMMING 

To  use  Fictitious  Play  to  solve  this  area  search  game,  both  the  searcher  and  target 
must  be  able  to  calculate  the  best  pure  strategy  response  to  any  mixed  strategy  of  the 
opponent.  If  the  area  search  game  were  small  enough  to  allow  a  total  enumeration  of 
searcher  and  target  pure  strategies  (i.e.,  the  payoff  matrix  could  be  completely  specified), 
then  choosing  the  best  pure  strategy  response  would  be  simple.  Assuming,  for  example, 
that the searcher is the row player, the best row $i$ to play against a mixed strategy is

$$\arg\max_i \left[ \sum_j y_j \, r_{\cdot j} \right]_i,$$

where

$y_j$ = probability of target selecting column $j$, and
$r_{\cdot j}$ = column $j$ of the payoff matrix.

The  target  could  likewise  find  the  best  column  response  to  any  probabilistic  combination 
of  rows. 
For the area search problem presented here, it is assumed that the large number of
possible pure strategies (i.e., paths) available for the searcher and target makes total
enumeration impractical. Another method must be used to determine pure strategy responses.
The procedure employed is Dynamic Programming. Assume, for example, that
the target plays a mixed strategy which is known to the searcher. That is, the searcher
knows all the paths that the target might select and the probability of the target selecting
each path. From this information the searcher computes

$$Y_t(j), \quad j = 1, \ldots, N \ \text{and} \ t = 1, \ldots, T,$$

which is the probability of the target being in cell $j$ in time period $t$. These probabilities
contain all the information necessary for the searcher to select his best response. The
searcher, as it turns out, does not care what mixed strategy the target uses as long as the
searcher can determine the $Y_t(j)$ values.
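
The reduction from a mixed strategy over paths to cell occupancy probabilities can be shown in a few lines. The Python sketch below is illustrative and assumes that a mixed strategy is supplied as a list of paths with matching probabilities, that cells and time periods are numbered from 0, and that the function name is hypothetical.

```python
def occupancy_probabilities(paths, probs, n_cells):
    """Return Y with Y[t][j] = probability of being in cell j at time index t,
    given a mixed strategy that plays paths[m] with probability probs[m]."""
    T = len(paths[0])
    Y = [[0.0] * n_cells for _ in range(T)]
    for path, p in zip(paths, probs):
        for t, cell in enumerate(path):
            Y[t][cell] += p
    return Y


if __name__ == "__main__":
    # Hypothetical target mix over two 3-period paths on three cells.
    Y = occupancy_probabilities([[2, 1, 0], [2, 2, 1]], [0.25, 0.75], n_cells=3)
    for t, row in enumerate(Y):
        print("t =", t, row)
```

Two different mixed strategies that give the same table of probabilities are indistinguishable to the opponent, which is why only the table needs to be stored.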

The searcher can now use Dynamic Programming to compute the best pure strategy
response to $Y_t(j)$. The recursion for $i = 1, \ldots, N$ and $t = 2, \ldots, T$ is

$$V_{t-1}(i) = \max_{j \in C_i} \{ Y_t(j) + V_t(j) \} \quad \text{and}$$

$$d_{t-1}(i) = \arg\max_{j \in C_i} \{ Y_t(j) + V_t(j) \},$$

where

$V_t(j)$ = maximum obtainable expected number of target detections from time $t + 1$ to
$T$ when the searcher starts in time period $t$ and is in cell $j$,
$C_i$ = the set of cells accessible from cell $i$ in one time period by the searcher, and
$d_t(i)$ = the best next cell to search given the searcher is in cell $i$ in time period $t$.

The recursion begins with

$$V_T(i) = 0, \quad i = 1, \ldots, N.$$

By solving a similar dynamic program, the target can determine his best pure strategy
response to any searcher cell occupancy probabilities, $X_t(i)$.

To demonstrate the validity of this recursion, it is observed that

$$\begin{aligned}
V_{t-1}(i) &= \max_{j \in C_i} \{ E[\text{\# detections at time } t \mid \text{searcher is in cell } j \text{ at time } t] \\
&\qquad + E[\text{\# detections from time } t+1 \text{ to } T \mid \text{searcher is in cell } j \text{ at time } t \\
&\qquad\quad \text{and search is conducted optimally from time } t+1 \text{ to time } T] \} \\
&= \max_{j \in C_i} \{ Y_t(j) + V_t(j) \}.
\end{aligned}$$

The first equality follows from the definition of $V_t(j)$ and the fact that the expected value
of a sum of random variables is the sum of the expected values. The second equality
results from conditioning on the target's cell at time $t$ and (again) the definition of $V_t(j)$.

It is noted that the searcher's problem is that of finding the longest (i.e., most profitable)
path through the $N \times T$ acyclic network in Figure 2 on page 10. When the
searcher reaches node $j$ in time period $t$, the payoff $Y_t(j)$ is received. Arcs connect each
cell $i$ with all cells in the set of accessible cells $C_i$.
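
The backward recursion and the recovery of the best path can be sketched as follows. This Python code is an illustration rather than the program used in the study. It assumes 0-based cell and time indices, a neighbor list for each cell, and a searcher starting cell that is fixed in advance, so that the detection probability in the first period is a constant added at the end; the function name is hypothetical.

```python
def searcher_best_response(Y, neighbors, start):
    """Best searcher path against target occupancy probabilities Y.

    Y[t][j]      probability that the target is in cell j at time index t
    neighbors[i] cells reachable from cell i in one time period (the set C_i)
    start        searcher's cell at time index 0 (assumed to be given)
    Returns (path, expected number of detections).
    """
    T, N = len(Y), len(Y[0])
    V = [[0.0] * N for _ in range(T)]       # V[T-1][i] = 0 starts the recursion
    d = [[None] * N for _ in range(T - 1)]  # d[t][i] = best next cell from i

    # Backward pass: fill V[t-1] and d[t-1] from Y[t] and V[t].
    for t in range(T - 1, 0, -1):
        for i in range(N):
            best = max(neighbors[i], key=lambda j: Y[t][j] + V[t][j])
            d[t - 1][i] = best
            V[t - 1][i] = Y[t][best] + V[t][best]

    # Forward pass: follow the stored decisions from the starting cell.
    path = [start]
    for t in range(T - 1):
        path.append(d[t][path[-1]])
    return path, Y[0][start] + V[0][start]


if __name__ == "__main__":
    # Hypothetical 3-cell strip; the target is known to sit in cell 2 throughout.
    neighbors = [[0, 1], [0, 1, 2], [1, 2]]
    Y = [[0.0, 0.0, 1.0]] * 3
    print(searcher_best_response(Y, neighbors, start=0))
```

The target's best response follows from the same routine with `max` replaced by `min` and the searcher's occupancy probabilities in place of `Y`.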
1. Updating Cell Occupancy Probabilities, $X_t(i)$ and $Y_t(i)$

In each iteration of fictitious play, the searcher and target update
$Y_t(i)$ and $X_t(i)$ respectively, based on the number of iterations performed so far and the
opponent's most recently observed pure strategy. Assigning equal weights to all observed
pure strategies, the update procedure is straightforward. If the searcher was in cell $i$ at
time $t$ in the most recently observed pure strategy, then

$$X_t^k(i) = \left( \frac{k-1}{k} \right) X_t^{k-1}(i) + \frac{1}{k},$$

and if the searcher was not in cell $i$ at time $t$,

$$X_t^k(i) = \left( \frac{k-1}{k} \right) X_t^{k-1}(i).$$

Likewise, if the target was in cell $i$ at time $t$ in the most recently observed pure strategy,
then

$$Y_t^k(i) = \left( \frac{k-1}{k} \right) Y_t^{k-1}(i) + \frac{1}{k},$$

and if the target was not in cell $i$ at time $t$,

$$Y_t^k(i) = \left( \frac{k-1}{k} \right) Y_t^{k-1}(i).$$

Here $k$ is the current iteration number and $X_t^k(i)$ and $Y_t^k(i)$ are the empirical cell occupancy
probabilities after the $k$th iteration.
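
The equal-weight update can be written as a single running-average step. The Python sketch below is illustrative; the function name and the 0-based indexing of cells and time periods are assumptions.

```python
def update_occupancy(P, path, k):
    """Running-average update of cell occupancy probabilities.

    P[t][i] holds the empirical probabilities after k - 1 iterations and
    path[t] is the cell occupied at time index t in the kth pure strategy.
    Returns the probabilities after k iterations.
    """
    return [
        [(k - 1) / k * p + (1.0 / k if path[t] == i else 0.0)
         for i, p in enumerate(row)]
        for t, row in enumerate(P)
    ]


if __name__ == "__main__":
    # Two cells, two time periods, two hypothetical observed paths.
    P = [[0.0, 0.0], [0.0, 0.0]]
    for k, path in enumerate([[0, 1], [1, 1]], start=1):
        P = update_occupancy(P, path, k)
    print(P)
```

Because each update uses only the previous table and the newest path, the full history of pure strategies never has to be stored.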
III.  DATA  GENERATION


A.  PROGRAM  DESCRIPTION 

The upper and lower bound data required for this study were generated by a modified
version of an existing computer program provided by Professor J. Eagle at the Naval
Postgraduate School in Monterey, California. The modified version was developed to
allow greater flexibility for variable manipulation. Conceptually the program was divided
into three major areas dealing with Game Theory, Fictitious Play and Dynamic
Programming.

1.  Finite  Matrix  Game 

The initial setup of the finite matrix game required inputs from the user that
included search area size, duration of search, replications of game and initial strategy of
players. Within this portion, all the possible decision paths (states) for adjacent cell path
movements during a time period (stage) were determined. The initial inputs and adjacent
cell paths were required for transition into the fictitious play portion.

2.  Fictitious  Play 

The Fictitious Play portion was the driver of the iterative process. It was responsible
for determining the mixed strategies of the players and evaluating the bounds
on the value of the game. As the best pure strategies were determined, the bounds were
reevaluated. The iterative process was started by the selection of an optimal pure
strategy for the target against the searcher's initial input strategy. A new lower bound
on  the  value  of  the  game  was  computed.  If  it  was  greater  (i.e.,  tighter)  than  the  current 
lower  bound,  it  was  retained.  Otherwise  the  new  lower  bound  was  ignored.  A  mixed 
strategy  for  the  target  was  then  determined  from  the  old  mixed  strategy  and  the  pure 
strategy  just  selected.  The  process  of  selecting  the  best  pure  strategy,  calculating  and 
evaluating  the  opponent's  bound  on  the  game,  and  determining  a  new  mixed  strategy 
was  then  accomplished  for  the  searcher  against  the  target's  current  mixed  strategy. 
Conducting  this  process  once  for  each  player  constituted  a  replication  of  the  game.  The 
determination  of  the  optimal  pure  strategies  and  the  bounds  on  the  value  of  the  game 
for  each  player  required  the  Dynamic  Programming  portion  of  the  program. 

3.  Dynamic  Programming 

Dynamic Programming provided the optimization procedure for the iterative
process and was considered the optimizer portion. It determined the best search paths
for each player in response to the mixed strategies of his opponent. The payoff (expected
number of detections) was maximized for the searcher and minimized for the
target. These paths became the optimal pure strategies required for the calculation of
empirical mixed strategies in the iterative process. Associated bounds on the value of
the game were calculated for each iteration.

B.  PROGRAM  VALIDATION 

The computer program was validated with results from a Linear Programming solution
to the area search game, provided by Professor A. Washburn at the Naval Postgraduate
School in Monterey, California. A comparison of the Linear Programming
solutions and the Fictitious Play approximations is presented in Table 1. The fictitious
play approximations were the computed midpoints between the upper and lower bounds
on the value of the game after 50,000 replications. The difference between the solutions
of the two approaches is represented as an absolute value. The validity of the fictitious
play computer program was supported by these results.


Table 1. COMPARISON OF LINEAR PROGRAMMING AND FICTITIOUS
PLAY SOLUTIONS (Value of the Game)

MATRIX   # TIME    LINEAR        FICTITIOUS   ABSOLUTE
SIZE     PERIODS   PROGRAMMING   PLAY         DIFFERENCE
1x6      11        1.2900        1.2902       0.0002
3x3       6        0.3548        0.3550       0.0002
4x4       8        0.2553        0.2554       0.0001
5x5      10        0.2012        0.2015       0.0003
6x6      12        0.1666        0.1665       0.0001
IV.  DATA  ANALYSIS


A.  CONVERGENCE  PROPERTIES 

In validating the Fictitious Play approach, it became apparent that an examination
of convergence properties was needed. The only relevant information available pertained
to convergence rates and was conflicting [Ref. 1, 4]. In order to reasonably predict
a  solution,  an  understanding  of  convergence  characteristics  was  required.  The  focus  of 
the  study  became  the  investigation  of  convergence  properties  with  specific  emphasis  on 
convergence  symmetry  and  rate. 

B.  CONVERGENCE  SYMMETRY 

The  upper  and  lower  bounds  on  the  value  of  the  game  converge  to  a  solution  as  the 
number  of  replications  increases.  This  was  proven  mathematically  by  J.  Robinson  [Ref. 
2].  It  was  observed  in  this  study  that  the  bounds  tended  to  converge  symmetrically. 
Graphically this is displayed for a 4x4 matrix in Figure 3. This characteristic was present
in all cases examined, which included various matrix sizes, shapes and initial player
positionings.  Additional  graphic  presentations  are  located  in  Appendix  A. 


[Figure 3 plot: convergence of upper and lower bounds with midpoint; 4x4 matrix, 8 time periods, 50K replications.]

Figure 3. Convergence of Upper and Lower Bounds With Midpoint Solutions

Since the solution to the game lies between the bounds and the bounds appear
to converge symmetrically, using the midpoint of the bounds as an approximation to the
value of the game seemed to be a reasonable approach. This method proved to be very
successful as evidenced in Table 1 on page 13. Further investigation revealed that
through the use of the midpoint method an accurate approximation could be predicted
without requiring a large number of replications. This is supported by a comparison of
the midpoint and actual solution for various replications in Table 2. Additional comparisons
are available in Appendix B.

Table 2. MIDPOINT SOLUTIONS FOR MULTIPLE REPLICATIONS

MATRIX   REPLICATIONS   MIDPOINT   ACTUAL     ABSOLUTE
SIZE     (x1000)        SOLUTION   SOLUTION   DIFFERENCE
4x4       5             0.2547     0.2553     0.0006
4x4      10             0.2556     0.2553     0.0003
4x4      20             0.2557     0.2553     0.0004
4x4      30             0.2556     0.2553     0.0003
4x4      40             0.2555     0.2553     0.0002
4x4      50             0.2554     0.2553     0.0001

Normally the fictitious play process is considered complete when the difference between
the bounds achieves some specified positive tolerance level. A small tolerance
level  results  in  an  accurate  solution.  It  was  observed  that  as  the  game  increased  in  size, 
more  replications  were  required  to  obtain  the  same  tolerance  level.  This  is  illustrated  in 
Figure  4. 


[Figure 4 plot: comparison of difference between upper and lower bounds for various matrix sizes.]

Figure 4. Comparison of Separation Between Bounds for Various Matrix Sizes

Figure  5  on  page  18  compares  the  convergence  of  the  midpoint  solution  to  that  of 
the  upper  bound  for  a  typical  game.  The  midpoint  was  in  all  cases  observed  to  achieve 
a  much  more  accurate  estimate  of  the  value  of  the  game  than  provided  by  either  of  the 
bounds. For example, 18,000 replications were required in a 4x4 game to bring the upper
bound within 0.005 of the actual solution; and only 400 replications were required
to  bring  the  midpoint  to  the  same  absolute  deviation. 
[Figure 5 plot: comparison of convergence of bound and midpoint; 4x4 matrix, 8 time periods, 50,000 replications.]

Figure 5. Comparison of Convergence of an Upper Bound and Midpoint

By examining further comparisons in Appendix C, it can be seen that as the game size
increases, the number of replications required by the bound to guarantee a preset absolute
deviation also increases. The number of replications required to give the same absolute
deviation for the midpoint solution appears to be considerably less influenced
by increasing game sizes. A comparison between the bound and midpoint replications
required to ensure a 0.005 absolute deviation for various game sizes is provided in
Table 3. The midpoint method is seen to provide an accurate approximation to the
solution very quickly and is not greatly influenced by the game size.

Table 3. REPLICATIONS TO ENSURE 0.005 ABSOLUTE DEVIATION FOR
BOUND AND MIDPOINT

MATRIX   REPLICATIONS FOR   REPLICATIONS FOR
SIZE     0.005 DEVIATION    0.005 DEVIATION
         WITH BOUND         WITH MIDPOINT
3x3      5,500              400
4x4      18,000             400
5x5      34,800             400
6x6      > 50,000           500

It can be concluded from the above observations that for the games examined the
bounds converge symmetrically to a solution. The midpoint between the bounds converges
much more rapidly to the solution than do the bounds. Additionally, the midpoint
method is apparently not greatly hindered by an increase in game size.

C.  CONVERGENCE  RATE 

The convergence rate for the Fictitious Play process is not clearly understood. For
the area search game, the convergence rates for various size games were experimentally
found to be slower than the hypothesized rate $1/n$. The rate of convergence of the
bounds for various games examined and the hypothesized rates are displayed in
Figure 6.


[Figure 6 plot: comparison of convergence for various matrix sizes with hypothesized rates.]

Figure 6. Comparison of Convergence of Upper Bound for Various Game Sizes

It can be clearly seen in Figure 6 that as the game sizes increase the associated time
(i.e., number of replications) required to reach a specific deviation also increases. That
is, the bounds take longer to converge for larger games.

Experimentally the data were fitted with the power function ($y = \alpha n^{\beta}$), where $n$ represented
the number of replications. This fit was accomplished by noting that for this
power function

$$\ln y = \ln \alpha + \beta \ln n,$$

which allowed a linear regression of $\ln y$ versus $\ln n$ to be done. Because of the
large amount of data, every fiftieth replication was fitted. The exponent ($\beta$), which represents
the convergence rate, appeared to range upward from the lower hypothesized value of
-1.0 and generally increased as the game size increased. The largest game examined was
a 10x10 matrix with only 6,850 replications due to the large amount of computer time
required to generate the data. Table 4 provides the values of ($\alpha$) and ($\beta$) for the fitted
power functions of various games examined. In all cases, the power function provided
an excellent fit to the data. This is supported by the closeness to 1.0 of the multiple
correlation coefficient (R SQUARE), displayed in Table 4. A graphical presentation of
the fitted data is provided in Appendix C.
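
The log-log regression described above can be reproduced with an ordinary least-squares line fit. The Python sketch below is illustrative: it uses NumPy's `polyfit`, the function name is hypothetical, and the data in the usage example are synthetic values generated from an assumed power law, not the data of this study.

```python
import numpy as np


def fit_power_function(n, y):
    """Fit y = alpha * n**beta by linear regression of ln y on ln n.

    Returns (alpha, beta, r_square), where r_square is computed on the
    logarithmic scale used for the regression.
    """
    ln_n = np.log(np.asarray(n, dtype=float))
    ln_y = np.log(np.asarray(y, dtype=float))
    beta, ln_alpha = np.polyfit(ln_n, ln_y, 1)  # slope first, then intercept
    residuals = ln_y - (ln_alpha + beta * ln_n)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((ln_y - ln_y.mean()) ** 2)
    return np.exp(ln_alpha), beta, 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Synthetic example: every fiftieth replication of an assumed power law.
    n = np.arange(50, 5001, 50, dtype=float)
    y = 0.4 * n ** -0.6
    print(fit_power_function(n, y))
```

Fitting on the logarithmic scale weights relative rather than absolute errors, so the late, small deviations influence the exponent as much as the early, large ones.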


Table 4. POWER FUNCTION FIT OF DATA WITH R SQUARE VALUES

MATRIX   POWER FUNCTION ($y = \alpha n^{\beta}$)   R SQUARE
SIZE     $\alpha$        $\beta$
2x2      0.29366    -0.79710         0.973
3x3      0.29687    -0.65306         0.972
4x4      0.40866    -0.59259         0.993
5x5      0.68224    -0.60602         0.995
6x6      0.65516    -0.56800         0.997
8x8      1.19410    -0.55711         0.993
10x10    0.78094    -0.44950         0.991

It can be concluded for this type of area search game that the Fictitious Play approach
has a convergence rate that is representative of a power function. The observed
$\beta$ values ranged from -0.7971 for the smallest game to -0.4495 for the largest. These data
suggest that Brown's hypothesized rate of $1/n$ (i.e., $\beta = -1$) is in general too optimistic.
Additionally, Karlin's rate of $1/\sqrt{n}$ (i.e., $\beta = -0.5$) may also be too optimistic for games
as large as or larger than the 10x10, 20-time period game examined here.
V.  CONCLUSIONS


A.  SUMMARY  OF  CONVERGENCE  PROPERTIES 

The  fictitious  play  approach  to  the  two-person  zero-sum  area  search  game  was 
successfully  implemented.  Convergence  properties  of  the  process,  for  this  type  of  game, 
were  investigated  and  the  following  conclusions  were  reached. 

•  For  the  area  search  games  examined,  the  upper  and  lower  bounds  on  the  value  of 
the  game  converge  symmetrically  toward  a  solution  as  the  number  of  replications 
of  the  game  is  increased. 

•  Because of the symmetrical convergence, the midpoint between the bounds provides
an accurate approximation to the solution.

•  The  midpoint  solution  converges  much  more  quickly  to  the  actual  solution  than 
do  the  bounds  and  is  apparently  not  greatly  influenced  by  the  size  of  the  game. 

•  The convergence rate of the process is representative of a power function
($y = \alpha n^{\beta}$). Experimentally, the exponent ($\beta$) was observed to vary between -0.7971
and -0.4495, and generally increased with the size of the game.

•  The  convergence  of  the  bounds  becomes  slower  as  the  size  of  the  game  is  increased. 

By observing the convergence characteristics of the Fictitious Play process, an approach
for predicting an accurate approximation of the solution was developed. This approach,
the midpoint method, required fewer replications of the game and provided an accurate
approximation of the solution. Because of this increased efficiency and the capability to
accurately predict a solution, the Fictitious Play process should be considered a possible
approach to solving area search games and warrants further investigation.
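
The bounds and the midpoint estimate summarized above can be sketched for a generic finite
matrix game as follows. This is a minimal illustration under stated assumptions, not the
program used in this study: the payoff matrix A, the function name fictitious_play, the
simultaneous updating of both players, and the tie-breaking rule (lowest index) are all
illustrative choices, and the small 2x2 matrix stands in for the much larger strategy sets
of an area search game.

```python
def fictitious_play(A, iterations):
    """Fictitious play on payoff matrix A (row player maximizes, column player minimizes).

    Returns (best_lower, best_upper, midpoint) after the given number of iterations.
    """
    m, n = len(A), len(A[0])
    row_scores = [0.0] * m  # cumulative payoff of each row against the column player's past plays
    col_scores = [0.0] * n  # cumulative payoff of each column against the row player's past plays
    i, j = 0, 0             # arbitrary initial pure strategies (an assumption)
    best_upper, best_lower = float("inf"), float("-inf")

    for k in range(1, iterations + 1):
        # Record the pure strategies chosen for this iteration.
        for r in range(m):
            row_scores[r] += A[r][j]
        for c in range(n):
            col_scores[c] += A[i][c]

        # The best reply to each empirical mixed strategy bounds the game value.
        best_upper = min(best_upper, max(row_scores) / k)
        best_lower = max(best_lower, min(col_scores) / k)

        # Each player picks a pure best reply to the opponent's combined past play.
        i = row_scores.index(max(row_scores))
        j = col_scores.index(min(col_scores))

    return best_lower, best_upper, (best_lower + best_upper) / 2.0


# Hypothetical 2x2 game with no saddle point; its value is 0.2.
A = [[2, -1],
     [-1, 1]]
lower, upper, midpoint = fictitious_play(A, 20000)
print("lower =", lower, "upper =", upper, "midpoint =", midpoint)
```

Because the value of the game always lies between the two bounds, the midpoint can be no
farther from the value than half the gap between them.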

B.  RECOMMENDATIONS  FOR  FUTURE  STUDY 

The  Fictitious  Play  process  provides  a  relatively  simple  approach  to  solving  the  area 
search  game.  It  not  only  produces  an  accurate  approximation  of  the  solution  of  the 
game,  but  also  provides  the  capability  to  determine  a  nearly  optimal  strategy  for  each 
player.  The  following  topic  is  recommended  as  a  possible  area  for  further  investigation 
and  future  study. 

1.  Comparison  of  Linear  Programming  and  Fictitious  Play  Approaches 

Solving the area search game with the use of computer resources can be approached
by several methods. One very promising method is the Linear Programming approach of
Washburn. The major advantages of this approach are that it does not require a large
amount of CPU time and that it gives exact answers. However, it can be very demanding
on memory resources, especially as the size of the games increases. In comparison, the
Fictitious Play approach requires minimal memory resources but is hampered by the large
amount of CPU time required and gives only approximate solutions. A comparison between
the Linear Programming and the Fictitious Play approaches is recommended for future
study, with emphasis on the tradeoffs between the resources of CPU time and memory space.
APPENDIX A. COMPARISONS OF CONVERGENCE OF UPPER AND
LOWER BOUNDS ON VALUE OF THE GAME


Figure 7. Convergence of Upper and Lower Bounds With Midpoint: 3x3 and 4x4
Matrix
[Two panels, each titled "Convergence of Upper and Lower Bounds With Midpoint";
x-axis: replications; y-axis: value of the game.]
Figure 8. Convergence of Upper and Lower Bounds With Midpoint: 5x5 and 6x6
Matrix
[Two panels titled "Convergence of Upper and Lower Bounds With Midpoint"; the surviving
panel subtitle reads "5x5 matrix, 10 time periods, 50,000 replications"; y-axis: value
of the game.]

Figure 9. Convergence of Upper and Lower Bounds With Midpoint: 1x6 Matrix
[One panel titled "Convergence of Upper and Lower Bounds With Midpoint"; y-axis: value
of the game.]
APPENDIX B. COMPARISON OF MIDPOINT AND ACTUAL SOLUTION


Table 5. MIDPOINT SOLUTIONS FOR MULTIPLE REPLICATIONS

MATRIX   REPLICATIONS   MIDPOINT   ACTUAL     ABSOLUTE
SIZE     (x1000)        SOLUTION   SOLUTION   DIFFERENCE

3x3       5             0.3556     0.3548     0.0008
         10             0.3549                0.0001
         20             0.3551                0.0003
         30             0.3549                0.0001
         40             0.3549                0.0001
         50             0.3550                0.0002

4x4       5             0.2547     0.2553     0.0006
         10             0.2556                0.0003
         20             0.2557                0.0004
         30             0.2556                0.0003
         40             0.2555                0.0002
         50             0.2554                0.0001

5x5       5             0.2024     0.2012     0.0012
         10             0.2020                0.0008
         20             0.2019                0.0007
         30             0.2013                0.0001
         40             0.2010                0.0002
         50             0.2015                0.0003

Table 6. MIDPOINT SOLUTIONS FOR MULTIPLE REPLICATIONS (CONT.)

MATRIX   REPLICATIONS   MIDPOINT   ACTUAL     ABSOLUTE
SIZE     (x1000)        SOLUTION   SOLUTION   DIFFERENCE

6x6       5             0.1667     0.1666     0.0001
         10             0.1661                0.0005
         20             0.1662                0.0004
         30             0.1664                0.0002
         40             0.1663                0.0003
         50             0.1665                0.0001

1x6       5             1.2899     1.2900     0.0001
         10             1.2902                0.0002
         20             1.2902                0.0002
         30             1.2903                0.0003
         40             1.2901                0.0001
         50             1.2902                0.0002
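
The absolute difference column in Tables 5 and 6 is the magnitude of the midpoint solution
minus the actual solution, reported to four decimal places. The short check below
recomputes that column from the tabulated values and reports the largest difference for
each game size. The dictionary layout and the function name abs_differences are
illustrative assumptions; the numbers are transcribed from the two tables.

```python
# Actual solutions from Tables 5 and 6, keyed by matrix size.
actual = {"3x3": 0.3548, "4x4": 0.2553, "5x5": 0.2012, "6x6": 0.1666, "1x6": 1.2900}

# Midpoint solutions, keyed by replications in thousands.
midpoints = {
    "3x3": {5: 0.3556, 10: 0.3549, 20: 0.3551, 30: 0.3549, 40: 0.3549, 50: 0.3550},
    "4x4": {5: 0.2547, 10: 0.2556, 20: 0.2557, 30: 0.2556, 40: 0.2555, 50: 0.2554},
    "5x5": {5: 0.2024, 10: 0.2020, 20: 0.2019, 30: 0.2013, 40: 0.2010, 50: 0.2015},
    "6x6": {5: 0.1667, 10: 0.1661, 20: 0.1662, 30: 0.1664, 40: 0.1663, 50: 0.1665},
    "1x6": {5: 1.2899, 10: 1.2902, 20: 1.2902, 30: 1.2903, 40: 1.2901, 50: 1.2902},
}


def abs_differences(size):
    """Return {replications: |midpoint - actual|} rounded to four places."""
    return {reps: round(abs(mid - actual[size]), 4)
            for reps, mid in midpoints[size].items()}


for size in actual:
    diffs = abs_differences(size)
    print(size, diffs, "largest:", max(diffs.values()))
```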

APPENDIX C. COMPARISON OF CONVERGENCE TO THE ACTUAL
SOLUTION OF A BOUND AND MIDPOINT


Figure 10. Convergence of a Bound and Midpoint for a 3x3 and 4x4 Matrix Game
[Two panels; y-axis: absolute deviation from solution.]

Figure 11. Convergence of a Bound and Midpoint for a 5x5 and 6x6 Matrix Game
[Two panels titled "Comparison of Convergence of Bound and Midpoint"; the surviving
panel subtitle reads "6x6 matrix, 12 time periods, 50,000 replications"; y-axis:
absolute deviation from solution.]

APPENDIX D. POWER FUNCTION FITTING OF DATA FOR VARIOUS
GAME SIZES


Figure 12. Convergence Data From 3x3 and 4x4 Matrix Fitted With Power Function
[Two panels titled "Convergence Data Fitted With Power Function"; the surviving panel
subtitle reads "3x3 matrix, 6 time periods, 50,000 replications"; y-axis: deviation
from midpoint solution.]

Figure 13. Convergence Data From 5x5 and 6x6 Matrix Fitted With Power Function
[Two panels titled "Convergence Data Fitted With Power Function"; y-axis: deviation
from midpoint solution.]